Moog Ladder Filter Generalizations Based on State Variable Filters
We propose a new style of continuous-time filter design composed of a cascade of 2nd-order state variable filters (SVFs) and a global feedback path. This family of filters is parameterized by the SVF cutoff frequencies and resonances, as well as the global feedback amount. For the case of two identical SVFs in cascade and a specific value of the SVF resonance, the proposed design reduces to the well-known Moog ladder filter. For another resonance value, it approximates the Octave CAT filter. The resonance parameter can be used to create new filters as well. We study the pole loci and transfer functions of the SVF building block and entire filter. We focus in particular on the effect of the proposed parameterization on important aspects of the filter’s response, including the passband gain and cutoff frequency error. We also present the first in-depth study of the Octave CAT filter circuit.
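The reduction to the Moog ladder can be checked numerically with a minimal sketch. It assumes a textbook lowpass SVF denominator s^2 + 2·r·wc·s + wc^2, where the damping `r` stands in for the resonance parameter; the abstract does not give the paper's own parameterization, so `r`, `wc`, `k`, and the function names below are illustrative. Under that assumption, r = 1 makes each SVF equal to two identical one-pole sections, and two such SVFs inside a negative feedback loop of gain `k` give the familiar ladder denominator (s + wc)^4 + k·wc^4.

```python
import numpy as np

def svf_lowpass_denominator(r, wc=1.0):
    """Denominator of an assumed lowpass SVF: s^2 + 2*r*wc*s + wc^2."""
    return np.array([1.0, 2.0 * r * wc, wc * wc])

def cascade_with_feedback(r1, r2, k, wc=1.0):
    """Two lowpass SVFs in cascade inside a negative feedback loop of gain k.

    Open loop:   L(s) = wc^4 / D(s), with D the product of the SVF denominators.
    Closed loop: H(s) = L / (1 + k L) = wc^4 / (D(s) + k wc^4).
    Returns (numerator, denominator) coefficients, highest power of s first.
    """
    open_loop_den = np.polymul(svf_lowpass_denominator(r1, wc),
                               svf_lowpass_denominator(r2, wc))
    closed_den = open_loop_den.copy()
    closed_den[-1] += k * wc ** 4
    return np.array([wc ** 4]), closed_den

k = 2.0
num, den = cascade_with_feedback(r1=1.0, r2=1.0, k=k)

# Ladder poles for wc = 1 solve (s + 1)^4 = -k.
n = np.arange(4)
ladder_poles = -1.0 + k ** 0.25 * np.exp(1j * np.pi * (2 * n + 1) / 4)

# Residual of the cascade denominator at the ladder poles (should be ~0).
residual = np.polyval(den, ladder_poles)

# DC (passband) gain of the closed loop: 1 / (1 + k).
dc_gain = np.polyval(num, 0.0) / np.polyval(den, 0.0)
```

In this sketch, choosing `r` different from 1 moves the open-loop poles off the real axis while the feedback term stays the same, and the DC gain drops as `k` grows.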
Time-Varying Filter Stability and State Matrix Products
We show a new sufficient criterion for time-varying digital filter stability: that the matrix norm of the product of state matrices over a certain finite number of time steps is bounded by 1. This extends Laroche’s Criterion 1, which only considered one time step, while hinting at extensions to two time steps. Further extending these results, we also show that there is no intrinsic requirement that filter coefficients be frozen over any time scale, and extend to any dimension a helpful theorem that allows us to avoid explicitly performing eigen- or singular value decompositions in studying the matrix norm. We give a number of case studies on filters that are known to be time-varying stable but cannot be proven so with the original criterion, and for which the new criterion succeeds.
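A toy illustration of the multi-step idea, as a sketch only: it assumes the spectral norm and a strict bound below 1, and it uses two made-up 2x2 state matrices that are not taken from the paper. The helper `max_product_norm` is hypothetical; it takes the largest norm over every ordered product of `steps` matrices drawn from a finite set, which covers coefficients switching arbitrarily among that set at every sample.

```python
import itertools
import numpy as np

def max_product_norm(state_matrices, steps):
    """Largest spectral norm over all ordered products of `steps` matrices
    drawn (with repetition) from `state_matrices`."""
    worst = 0.0
    for combo in itertools.product(state_matrices, repeat=steps):
        prod = np.eye(combo[0].shape[0])
        for a in combo:
            prod = a @ prod  # later state matrices multiply on the left
        worst = max(worst, np.linalg.norm(prod, 2))
    return worst

# Illustrative state matrices (assumed, not from the paper).
A1 = np.array([[0.0, 2.0], [0.1, 0.0]])
A2 = np.array([[0.0, 1.5], [0.2, 0.0]])

one_step = max_product_norm([A1, A2], steps=1)  # single-step check
two_step = max_product_norm([A1, A2], steps=2)  # two-step product check
```

Here the single-step norm exceeds 1, so a one-step test is inconclusive, while every two-step product has norm below 1, so the state contracts over each pair of samples regardless of the switching order.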
A Direct Microdynamics Adjusting Processor with Matching Paradigm and Differentiable Implementation
In this paper, we propose a new processor capable of directly changing the microdynamics of an audio signal primarily via a single dedicated user-facing parameter. The novelty of our processor is that it has built into it a measure of relative level, a short-term signal strength measurement which is robust to changes in signal macrodynamics. The resulting dynamic range processing is signal level-independent in nature, and attempts to directly alter its observed relative level measurements. The inclusion of such a meter within our proposed processor also gives rise to a natural solution to the dynamics matching problem, where we attempt to transfer the microdynamic characteristics of one audio recording to another by estimating appropriate settings for the processor. We suggest a means of providing a reasonable initial guess for processor settings, followed by an efficient iterative algorithm to refine our estimates. Additionally, we implement the processor as a differentiable recurrent layer and show its effectiveness when wrapped around a gradient descent optimizer within a deep learning framework. Moreover, we illustrate that the proposed processor has more favorable gradient characteristics relative to a conventional dynamic range compressor. Throughout, we consider extensions of the processor, matching algorithm, and differentiable implementation for the multiband case.
Real-Time Singing Voice Conversion Plug-In
In this paper, we propose an approach to real-time singing voice conversion and outline its development as a plug-in suitable for streaming use in a digital audio workstation. In order to simultaneously ensure pitch preservation and reduce the computational complexity of the overall system, we adopt a source-filter methodology and consider a vocoder-free paradigm for modeling the conversion task. In this case, the source is extracted and altered using more traditional DSP techniques, while the filter is determined using a deep neural network. The latter can be trained in an end-to-end fashion and additionally uses adversarial training to improve system fidelity. Careful design allows the system to scale naturally to sampling rates higher than the neural filter model sampling rate, outputting full-band signals while avoiding the need for resampling. Accordingly, the resulting system, when operating at 44.1 kHz, incurs under 60 ms of latency and operates 20 times faster than real-time on a standard laptop CPU.